{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "78b45321-7740-4399-b2ad-459811131de3",
   "metadata": {},
   "source": [
    "# How to get log probabilities\n",
    "\n",
    ":::info Prerequisites\n",
    "\n",
    "This guide assumes familiarity with the following concepts:\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
    "\n",
    ":::\n",
    "\n",
    "Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f5016bf-2a7b-4140-9b80-8c35c7e5c0d5",
   "metadata": {},
   "source": [
    "## OpenAI\n",
    "\n",
    "Install the LangChain x OpenAI package and set your API key"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fe5143fe-84d3-4a91-bae8-629807bbe2cb",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -qU langchain-openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "fd1a2bff-7ac8-46cb-ab95-72c616b45f2c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f88ffa0d-f4a7-482c-88de-cbec501a79b1",
   "metadata": {},
   "source": [
    "For the OpenAI API to return log probabilities we need to configure the `logprobs=True` param. Then, the logprobs are included on each output [`AIMessage`](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessage.html) as part of the `response_metadata`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "d1bf0a9a-e402-4931-ab53-32899f8e0326",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'token': 'I', 'bytes': [73], 'logprob': -0.26341408, 'top_logprobs': []},\n",
       " {'token': \"'m\",\n",
       "  'bytes': [39, 109],\n",
       "  'logprob': -0.48584133,\n",
       "  'top_logprobs': []},\n",
       " {'token': ' just',\n",
       "  'bytes': [32, 106, 117, 115, 116],\n",
       "  'logprob': -0.23484154,\n",
       "  'top_logprobs': []},\n",
       " {'token': ' a',\n",
       "  'bytes': [32, 97],\n",
       "  'logprob': -0.0018291725,\n",
       "  'top_logprobs': []},\n",
       " {'token': ' computer',\n",
       "  'bytes': [32, 99, 111, 109, 112, 117, 116, 101, 114],\n",
       "  'logprob': -0.052299336,\n",
       "  'top_logprobs': []}]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\").bind(logprobs=True)\n",
    "\n",
    "msg = llm.invoke((\"human\", \"how are you today\"))\n",
    "\n",
    "msg.response_metadata[\"logprobs\"][\"content\"][:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d1ee1c29-d27e-4353-8c3c-2ed7e7f95ff5",
   "metadata": {},
   "source": [
    "And are part of streamed Message chunks as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4bfaf309-3b23-43b7-b333-01fc4848992d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[]\n",
      "[{'token': 'I', 'bytes': [73], 'logprob': -0.26593843, 'top_logprobs': []}]\n",
      "[{'token': 'I', 'bytes': [73], 'logprob': -0.26593843, 'top_logprobs': []}, {'token': \"'m\", 'bytes': [39, 109], 'logprob': -0.3238896, 'top_logprobs': []}]\n",
      "[{'token': 'I', 'bytes': [73], 'logprob': -0.26593843, 'top_logprobs': []}, {'token': \"'m\", 'bytes': [39, 109], 'logprob': -0.3238896, 'top_logprobs': []}, {'token': ' just', 'bytes': [32, 106, 117, 115, 116], 'logprob': -0.23778509, 'top_logprobs': []}]\n",
      "[{'token': 'I', 'bytes': [73], 'logprob': -0.26593843, 'top_logprobs': []}, {'token': \"'m\", 'bytes': [39, 109], 'logprob': -0.3238896, 'top_logprobs': []}, {'token': ' just', 'bytes': [32, 106, 117, 115, 116], 'logprob': -0.23778509, 'top_logprobs': []}, {'token': ' a', 'bytes': [32, 97], 'logprob': -0.0022134194, 'top_logprobs': []}]\n"
     ]
    }
   ],
   "source": [
    "ct = 0\n",
    "full = None\n",
    "for chunk in llm.stream((\"human\", \"how are you today\")):\n",
    "    if ct < 5:\n",
    "        full = chunk if full is None else full + chunk\n",
    "        if \"logprobs\" in full.response_metadata:\n",
    "            print(full.response_metadata[\"logprobs\"][\"content\"])\n",
    "    else:\n",
    "        break\n",
    "    ct += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19766435",
   "metadata": {},
   "source": [
    "## Next steps\n",
    "\n",
    "You've now learned how to get logprobs from OpenAI models in LangChain.\n",
    "\n",
    "Next, check out the other how-to guides chat models in this section, like [how to get a model to return structured output](/docs/how_to/structured_output) or [how to track token usage](/docs/how_to/chat_token_usage_tracking)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
